fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup by 4444J99 · Pull Request #2802 · openai/openai-agents-python

4444J99 · 2026-03-28T14:38:28Z

Summary

Fixes the "leaked semaphore objects to clean up at shutdown" warning that occurs when using MCPServerStdio with MCP servers that use Python multiprocessing internally.

Root cause: During cleanup(), the exit_stack.aclose() unwinds contexts in LIFO order — ClientSession.__aexit__ runs first (cancelling its internal task group), then the transport's __aexit__ runs. Because the ClientSession cancellation kills the stdout_reader/stdin_writer tasks inside the transport, the transport's finally block — which closes stdin and waits for the subprocess — may never execute cleanly. The subprocess is killed without a chance to flush its multiprocessing resources (semaphores, shared memory, etc.), triggering Python's resource_tracker warning.

Fix: Before calling exit_stack.aclose(), explicitly close the session's write stream. This delivers EOF to the subprocess's stdin, giving it a window to release its resources and exit cleanly before the task groups are cancelled. Also reinitialise exit_stack and clear server_initialize_result during cleanup so the same server instance can reconnect.

Test plan

Added test_cleanup_resets_state_for_reconnection — verifies all session state is cleared and exit stack is replaced after cleanup
Added test_reconnect_after_cleanup — verifies a server instance can connect, use tools, cleanup, then reconnect and use tools again
Added test_cleanup_closes_write_stream_before_exit_stack — verifies the write stream close happens before exit stack unwind (the critical ordering invariant)
All 175 existing MCP tests pass
ruff check and ruff format pass

Issue number

Closes #618

Checks

I've added new tests (if relevant)
I've added/updated the relevant documentation
I've run make lint and make format
I've made sure tests pass

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4468bf6f4d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-28T14:40:43Z

src/agents/mcp/server.py

+                # Reinitialize the exit stack so the same server instance can
+                # reconnect cleanly after cleanup.
+                self.exit_stack = AsyncExitStack()


Preserve original exit stack when cleanup is interrupted

Reinitializing self.exit_stack in finally discards pending cleanup callbacks even when await self.exit_stack.aclose() is interrupted (for example by CancelledError, which this method explicitly re-raises). In that case the old stack may still hold open transport/process contexts, but after reassignment there is no way to retry closing them, so subprocesses can be orphaned and the semaphore leak warning can persist during shutdown. Only replacing the stack after a successful aclose() avoids this resource-leak path.

Useful? React with 👍 / 👎.

seratch · 2026-03-29T00:01:34Z

Thanks for sharing this. I also explored a viable solution for this, but I found that the sufficient fix requires substantial changes across this SDK. So, I don't think the suggested changes here is enough for the issue. See also: #2573 (comment)

4444J99 · 2026-03-30T08:22:04Z

Understood — happy to contribute to the broader fix in #2573 if that's useful.

github-actions · 2026-04-10T02:14:48Z

This PR is stale because it has been open for 10 days with no activity.

4444J99 · 2026-04-14T16:17:25Z

@seratch — following up on your feedback from March 29. A few developments since then:

Fix #618 MCPServerStdio leaked semaphores warning at shutdown #2573 was closed (March 2) without merging, so the broader solution it represented hasn't landed.
fix(mcp): reset AsyncExitStack after cleanup for reconnect (#618) #2882 was just opened today, addressing the AsyncExitStack reset portion of the reconnect problem — this overlaps with part of what this PR does.
This PR (fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup #2802) addresses the root cause more broadly: it closes the write stream before unwinding the exit stack, giving the subprocess a clean shutdown window for its multiprocessing resources (semaphores, shared memory). The exit-stack reset and reconnect support are also included.

You mentioned the fix here isn't sufficient and that substantial changes are needed across the SDK. Could you share more about what that broader scope looks like? I'd like to understand the architectural direction so I can iterate meaningfully rather than guess.

Happy to coordinate with #2882 if that helps reduce duplicate effort.

…ent semaphore leaks When MCPServerStdio.cleanup() tears down via exit_stack.aclose(), the ClientSession exits first (cancelling its task group), then the transport exits. The ClientSession cancellation kills the reader/writer tasks inside the transport before the transport's finally block can close stdin and wait for the subprocess to exit gracefully. This race prevents the subprocess from flushing multiprocessing resources, causing "leaked semaphore" warnings from Python's resource_tracker. Fix: explicitly close the session's write stream before the exit_stack unwind. This delivers EOF to the subprocess's stdin, giving it a window to release its resources and exit cleanly before task groups are cancelled. Also reinitialise the exit_stack and clear server_initialize_result during cleanup so the same server instance can reconnect after cleanup. Closes openai#618

mypy now reports [method-assign] instead of [assignment] for monkey-patched async methods. Update type: ignore comments to use the narrower code so `make typecheck` passes cleanly. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

4444J99 · 2026-04-14T21:30:33Z

@seratch — following up on the "substantial changes" feedback. I've traced the dependency chain:

Root cause is in the MCP library, not the SDK:

modelcontextprotocol/python-sdk#1960 — stdio_client has a race condition during shutdown. The stdout_reader task and cleanup code race when the context exits, causing BrokenResourceError and leaked semaphores.
modelcontextprotocol/python-sdk#2268 — an open fix PR addressing this (March 16, no review yet).

What our PR does is bridge the gap at the SDK level: by closing session._write_stream before exit_stack.aclose(), we deliver EOF to the subprocess stdin, giving it a chance to flush resources before task group cancellation. This is admittedly reaching into a private attribute of ClientSession — which is the fragility you flagged.

Two paths forward:

Keep this as a bridge: our fix works now, prevents the semaphore warning for users, and can be removed once mcp >= 1.27 (or whichever version includes Enhance exception handling in MCP server initialization and cleanup #2268) is released. We could add a # TODO: remove once mcp library fixes #1960 comment.
Wait for the library fix: close this PR, and instead bump the mcp dependency lower bound once Enhance exception handling in MCP server initialization and cleanup #2268 lands. No SDK-level workaround needed.

Happy to go either direction based on your preference. If option 1, I can also add a version check to only apply the workaround on mcp < X once the threshold is known.

github-actions bot added bug Something isn't working feature:mcp labels Mar 28, 2026

chatgpt-codex-connector bot reviewed Mar 28, 2026

View reviewed changes

seratch marked this pull request as draft March 29, 2026 00:01

github-actions bot added the stale label Apr 10, 2026

4444J99 and others added 2 commits April 14, 2026 12:33

4444J99 force-pushed the fix/mcp-stdio-semaphore-leak branch from 4468bf6 to 08c555a Compare April 14, 2026 16:36

4444J99 mentioned this pull request Apr 14, 2026

Fix stdio_client BrokenResourceError race condition during shutdown modelcontextprotocol/python-sdk#2268

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup#2802

fix(mcp): prevent leaked semaphore warnings during MCPServerStdio cleanup#2802
4444J99 wants to merge 2 commits intoopenai:mainfrom
4444J99:fix/mcp-stdio-semaphore-leak

4444J99 commented Mar 28, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Uh oh!

seratch commented Mar 29, 2026

Uh oh!

4444J99 commented Mar 30, 2026

Uh oh!

github-actions bot commented Apr 10, 2026

Uh oh!

4444J99 commented Apr 14, 2026

Uh oh!

4444J99 commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

4444J99 commented Mar 28, 2026

Summary

Test plan

Issue number

Checks

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

seratch commented Mar 29, 2026

Uh oh!

4444J99 commented Mar 30, 2026

Uh oh!

github-actions bot commented Apr 10, 2026

Uh oh!

4444J99 commented Apr 14, 2026

Uh oh!

4444J99 commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants